Fix non-range HTTP download: downloaded bytes exceeding total file size#1307
Fix non-range HTTP download: downloaded bytes exceeding total file size#1307Copilot wants to merge 5 commits into
Conversation
When HTTP server doesn't return Content-Length or doesn't support Range requests, the download can accumulate extra bytes on pause/resume because: 1. Each retry/resume re-downloads from the beginning but writes at the accumulated offset 2. No size limit enforcement for non-range downloads 3. onDownloadComplete incorrectly uses chunk.remain() for non-range downloads, which gives negative values for chunk(0,0) Fixes: - Reset chunk state in downloadChunkOnce and runConnectionFallback for non-range downloads since server will send from beginning - Add size limit checks in download loops when Size is known - Fix onDownloadComplete to use conn.Completed for non-range downloads instead of unreliable chunk.remain() - Add test cases for non-range download, pause/resume, and size limit Co-authored-by: monkeyWie <13160176+monkeyWie@users.noreply.github.com>
… goroutine The go f.waitForCompletion() goroutine in expandConnections was redundant since the downloadLoop's slow start loop already handles stateSteady by calling waitForCompletion directly. The orphaned goroutine could race with doStart() when writing f.ctx during pause/resume. Co-authored-by: monkeyWie <13160176+monkeyWie@users.noreply.github.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1307 +/- ##
==========================================
- Coverage 73.70% 73.58% -0.13%
==========================================
Files 48 48
Lines 6192 6253 +61
==========================================
+ Hits 4564 4601 +37
- Misses 1208 1228 +20
- Partials 420 424 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@copilot 不对哦,我抓包看那个http请求都没有返回Content-Length,所以你这个单测也模拟的不对,你可以试试这个链接:https://github.com/GopeedLab/gopeed/archive/refs/heads/main.zip |
…l bug scenario Remove StartTestNoRangeServer and its tests since the real bug is about servers that don't return Content-Length at all (like GitHub's zip downloads), not servers with Content-Length but no Range support. The chunked tests (TestFetcher_DownloadChunkedContinue and TestFetcher_DownloadChunkedNotExceedSize) correctly simulate the real-world bug using StartTestCustomServer which serves files without Content-Length. Co-authored-by: monkeyWie <13160176+monkeyWie@users.noreply.github.com>
已修复,移除了 |
|
@copilot Chunked下载之前已经有单测了,不需要再添加了,要分析这种bug偶发出现的根本原因然后修复,然后如果能针对性的对于这个bug编写一个单测才有意义 |
…test Remove TestFetcher_DownloadChunkedContinue (redundant with existing tests). Replace TestFetcher_DownloadChunkedNotExceedSize with TestFetcher_DownloadNoRangePauseResume which specifically targets the bug: - Documents the root cause (non-range resume writes at accumulated offset instead of byte 0) - Asserts downloaded file size does not exceed source file size (the exact bug symptom from the issue) - Verifies file integrity via MD5 Co-authored-by: monkeyWie <13160176+monkeyWie@users.noreply.github.com>
已移除冗余的 根本原因: 非 Range 下载在 pause/resume 后,服务器从 byte 0 重新发送数据,但代码在之前累积的偏移量处写入,导致数据被追加而不是覆盖,文件大小无限增长。 单测验证: 使用不返回 Content-Length 的服务器,暂停后恢复,断言下载文件大小不超过源文件大小(即 bug 的具体表现),并校验 MD5。 提交: 0bd3067 |
When a server doesn't return
Content-Lengthor doesn't supportAccept-Ranges, pause/resume causes downloaded bytes to grow unboundedly past the actual file size (e.g. 81 MB / 35 MB at 231%).Root cause
On resume, the server resends from byte 0 (no
Rangesupport), but the code wrote at the previously accumulated offset (conn.Chunk.Downloaded), appending a second copy instead of overwriting. Each pause/resume cycle roughly doubled the file. This bug occurs sporadically because it requires the user to pause and resume a download from a server that doesn't support Range requests.Non-range retry writes at wrong offset
Each retry in
downloadChunkOncemakes a request withoutRangeheader (server doesn't support it), so the server sends from byte 0. But the code writes atconn.Chunk.Downloaded(accumulated from prior attempts), appending instead of overwriting. Every retry roughly doubles the file.Fix: Reset
conn.Chunk.Downloadedandconn.Downloadedto 0 at the start ofdownloadChunkOnceandrunConnectionFallbackfor non-range downloads, since the server restarts the transfer each time.No size limit enforcement for non-range downloads
downloadChunkOnceonly checkschunk.remain()whenRange=true. WhenSize > 0butRange=false, data is written without bound.Fix: Add size limit checks in all three download loops (
downloadChunkOnce,runConnectionWithResolveResp,runConnectionFallback) whenSize > 0, regardless of range support.Premature completion masking the error
onDownloadCompleteuseschunk.remain()to detect incomplete chunks. For non-range chunks(0,0), any downloaded bytes makeremain()negative, so a timed-out connection is incorrectly treated as successfully completed.Fix: For non-range downloads, use
conn.Completed(set only on EOF) instead ofchunk.remain()to determine completion.Data race in
expandConnectionsgo f.waitForCompletion()launched an orphaned goroutine that could race withdoStart()writingf.ctxon resume. This goroutine was redundant—thedownloadLoopslow-start loop already callswaitForCompletion()after checkingstateSteady.Test
TestFetcher_DownloadNoRangePauseResume— targeted test that usesStartTestCustomServer()(noContent-Length, noAccept-Ranges) to simulate real-world servers like GitHub's zip downloads, performs pause/resume, then asserts that the downloaded file size does not exceed the source file size and that the file content matches via MD5.Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.